PDX-473/471: fix(mcp) — all-level violations in stop decision; read-only diff#180
Merged
Merged
Conversation
…ouple read-only diff from save_results RCA: Four correctness gaps surfaced in the 1.5.1 PR #172 release review that either let recommended_next_action="stop" fire while violations remained, or broke read-only diff. (B1) testPlanValidate called calcNextAction(score, false) without the remainingViolationCount arg (defaults to 0), so a plan whose test cases were structurally valid returned "stop" even when PLAN-META-* or BP violations remained. (B2) testSuiteValidate.collectAllViolations collected tc.issues but not tc.best_practices_violations, so a suite with BP-only failures returned "stop" and an empty diff. (B3) projectValidateFromPath used currentViolations (which only contains top-level project_violations) for the stop decision, so a project with nested plan/suite violations returned "stop" when the project root was clean. (B4) baseline load was gated on save_results !== false, so a valid baseline_run_id returned BASELINE_NOT_FOUND in read-only mode — save_results should control persistence of the current run only, not whether existing runs can be read. (B5) docs/PROVAR_TOOL_GUIDE.md was copied into lib/mcp/docs/ during compile but was missing from wireit.compile.files inputs, so wireit could ship a stale guide after edits. Fix: Added countAllPlanViolations and countSuiteViolations helpers in testPlanValidate.ts and pass the all-level count to calcNextAction. Added tc.best_practices_violations to collectAllViolations in testSuiteValidate.ts. Added countAllProjectViolations helper in projectValidateFromPath.ts; the diff snapshot still uses project_violations to keep response shape stable, but the stop decision now uses the all-level count. Decoupled baseline load and hasAnyRun from save_results in projectValidateFromPath.ts so read-only diff works (matches the testCaseValidate.ts pattern). Added docs/PROVAR_TOOL_GUIDE.md to wireit.compile.files. Updated stale "stop when score=100" assertions in testPlanValidate.test.ts and testSuiteValidate.test.ts that were locking in the bug, and added new B1/B2/B3/B4 coverage. Validation: yarn compile clean, full mocha 1143 passing / 0 failing, yarn lint clean, 55/55 smoke pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Quality Orchestrator🟢 LOW · 🧪 Tests to Run · Running 3 of 51 tests
▶ Run commandnpx vitest run \
unit/mcp/projectValidateFromPath.test.ts \
unit/mcp/testPlanValidate.test.ts \
unit/mcp/testSuiteValidate.test.ts⚡ quality-orchestrator · |
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes five correctness issues from the 1.5.1 release review of PR #172, primarily around the recommended_next_action stop signal incorrectly firing when violations remained at non-tested levels (plan metadata, suite-level, BP), and a read-only diff mode that erroneously required save_results !== false.
Changes:
- B1/B2/B3: Compute and pass an all-level violation count into
calcNextActionfor the plan, suite (viacollectAllViolationsaddingbest_practices_violations), and project validators sostopno longer fires while violations remain. - B4: Decouple baseline read from
save_resultsinprojectValidateFromPathso a read-only diff against a savedbaseline_run_idworks. - B5: Add
docs/PROVAR_TOOL_GUIDE.mdtowireit.compile.filesso wireit invalidates the cache on edits.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/mcp/tools/testSuiteValidate.ts | collectAllViolations now includes per-tc best_practices_violations so they feed the stop-decision count. |
| src/mcp/tools/testPlanValidate.ts | New countSuiteViolations/countAllPlanViolations helpers; pass remaining count to calcNextAction. |
| src/mcp/tools/projectValidateFromPath.ts | New countAllProjectViolations helper; baseline load and hasAnyRun no longer gated on save_results. |
| package.json | Add docs/PROVAR_TOOL_GUIDE.md to wireit.compile.files. |
| test/unit/mcp/testSuiteValidate.test.ts | Replace stale stop-when-100 assertion with B2 BP-violations-remain notEqual('stop') test. |
| test/unit/mcp/testPlanValidate.test.ts | Rework stale test for B1 (BP remain) and add new B1 test for plan-metadata-remain. |
| test/unit/mcp/projectValidateFromPath.test.ts | Add B4 read-only diff test and B3 nested-violations stop-decision test. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Five correctness fixes surfaced during the 1.5.1 PR #172 release review.
calcNextAction(score, false)was called withoutremainingViolationCount, defaulting to0. A plan whose test cases were structurally valid returned"stop"while PLAN-META-*, suite-level, or BP violations remained. Now passes the all-level violation count via newcountAllPlanViolations/countSuiteViolationshelpers.collectAllViolationscollectedtc.issuesbut nottc.best_practices_violations. A suite whose test cases failed only BP rules returned"stop"and an empty diff. Mirrors the testcase-level fix from PR PDX-470/471/473: feat(mcp): add detail level, diff mode, and completeness score to validation tools #168 review for the suite collector.currentViolations(top-levelproject_violationsonly) was used for both the diff snapshot and the stop decision. A project with nested plan/suite violations but a clean root returned"stop". Diff snapshot is unchanged (stable response shape); stop decision now uses a newcountAllProjectViolationshelper.save_results !== false, so a validbaseline_run_idreturnedBASELINE_NOT_FOUNDin read-only mode.save_resultsshould control persistence of the current run only, not whether existing runs can be read. Decoupled to match the testcase tool's pattern.docs/PROVAR_TOOL_GUIDE.mdwas copied intolib/mcp/docs/bycompilebut missing fromwireit.compile.filesinputs, so wireit could ship a stale guide after edits.AC traceback
Test plan
"stop when score=100"assertions intestPlanValidate.test.tsandtestSuiteValidate.test.tsthat were locking in the bug — these passed pre-fix because the broken stop decision matched the broken assertionsave_results=false+ validbaseline_run_id→ returns diff (notBASELINE_NOT_FOUND)yarn compilecleanyarn lintcleannode scripts/mcp-smoke.cjs— 55/55 PASSNotes for reviewer
ValidatedTestCase(project-layer test case type) does not carrybest_practices_violations; that's a pre-existing limitation of the project-validation service mapping. The B3 helper counts every other violation surface and falls back gracefully — explicit comment incountAllProjectViolations. Surfacing BP at the project layer is a follow-on (would belong in PDX-487 or similar).assert.notEqual(not a positivestopassertion) because building a truly-zero-violations TC fixture in unit tests is brittle. The positivestoppath is covered byvalidationScore.test.ts(calcNextAction(100, false, 0) === 'stop').🤖 Generated with Claude Code